Hawaii 2008 Presidential Campaign Finance Analysis for President Barack Obama by Alina Jurgensen

##       cmte_id           cand_id               cand_nm     
##  C00431445:15243   P80003338:15243   Obama, Barack:15243  
##                                                           
##                                                           
##                                                           
##                                                           
##                                                           
##                                                           
##              contbr_nm          contbr_city   contbr_st 
##  KHV, KHV         :  110   HONOLULU   :6558   HI:15243  
##  HOUSTON, PATRICIA:   76   KAILUA     :1202             
##  MURAKAMI, ROBERT :   61   HILO       : 611             
##  YAMANOHA, LYN    :   60   KANEOHE    : 586             
##  SULTAN, A A      :   58   KAMUELA    : 517             
##  TYLER, CLARK     :   58   KAILUA KONA: 435             
##  (Other)          :14820   (Other)    :5334             
##    contbr_zip                     contbr_employer    contbr_occupation
##  Min.   :        0   NOT EMPLOYED         :3738   RETIRED     : 3403  
##  1st Qu.:967343212   SELF EMPLOYED        :2842   ATTORNEY    :  620  
##  Median :967688422   UNIVERSITY OF HAWAII : 452   NOT EMPLOYED:  477  
##  Mean   :882254074   INFORMATION REQUESTED: 325   PROFESSOR   :  350  
##  3rd Qu.:968165550                        : 318   TEACHER     :  340  
##  Max.   :968480001   (Other)              :7531   (Other)     :10021  
##  NA's   :3           NA's                 :  37   NA's        :   32  
##  contb_receipt_amt  contb_receipt_dt
##  Min.   :-4600.0   16-Oct-08:  313  
##  1st Qu.:   25.0   30-Sep-08:  313  
##  Median :   75.0   27-Aug-08:  299  
##  Mean   :  203.6   31-Jul-08:  282  
##  3rd Qu.:  125.0   24-Oct-08:  281  
##  Max.   : 4600.0   31-Oct-08:  238  
##  NA's   :35        (Other)  :13517  
##                                 receipt_desc   memo_cd  
##                                       :14424    :14165  
##  REATTRIBUTION/REDESIGNATION REQUESTED:  266   X: 1078  
##  REDESIGNATION FROM                   :  252            
##  REDESIGNATION TO                     :  248            
##  Refund                               :   38            
##  REATTRIBUTION FROM                   :    5            
##  (Other)                              :   10            
##                                  memo_text      form_tp     
##                                       :13040        :   35  
##  OVF TRANSFER                         : 1054   SA17A:14100  
##  REATTRIBUTION/REDESIGNATION REQUESTED:  266   SA18 : 1070  
##  ORIGINAL TRANSACTION                 :  256   SB28A:   38  
##  REDESIGNATION FROM                   :  252                
##  REDESIGNATION TO                     :  248                
##  (Other)                              :  127                
##     file_num             tran_id      election_tp 
##  Min.   :359390              :   35        :  35  
##  1st Qu.:753671   2299577    :    2   G2008:6418  
##  Median :753769   2299577RMIN:    2   P2008:8790  
##  Mean   :658074   10000818   :    1               
##  3rd Qu.:753821   10001490   :    1               
##  Max.   :754317   10001522   :    1               
##  NA's   :35       (Other)    :15201

Univariate Plots Section

My report explores the 2008 presidential campaign donations to President Barack Obama in his native state, Hawaii.

I start by looking at my variables to better understand my data.

The majority of contributors come from Honolulu, followed by Kailua and Hilo.

I created a new category ‘County’ and I see that most contributors come from Honolulu county.
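One way to derive such a county variable is a simple city lookup. The sketch below is in Python rather than the R used for this report, and it is only an illustration: the `CITY_TO_COUNTY` table and the `county_for` helper are hypothetical names, the table lists only a few of the cities that appear in the summary above, and every other city falls into ‘Other’.

```python
# Hypothetical city-to-county lookup (an assumption, not the report's own code).
# Only a handful of cities from the summary output are listed here.
CITY_TO_COUNTY = {
    "HONOLULU": "Honolulu",
    "KAILUA": "Honolulu",
    "KANEOHE": "Honolulu",
    "HILO": "Hawaii",
    "KAMUELA": "Hawaii",
    "KAILUA KONA": "Hawaii",
}


def county_for(city):
    """Return the county for a contbr_city value, or 'Other' if it is unknown."""
    if city is None:
        return "Other"
    # Normalise case and surrounding whitespace before the lookup
    return CITY_TO_COUNTY.get(city.strip().upper(), "Other")
```

A complete mapping would need every city in the data, which is why a large ‘Other’ group remains when the table is short.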

I created a new category for occupation and I see that the largest groups of contributors are ‘Retired’, ‘Professional and business services’, ‘Not-employed’, ‘Educational services’ and ‘Health care and social assistance’.

The primary election received more contributions than the general election (8,790 versus 6,418 records). I expected this, since the primary campaign lasted longer, from February 2007 to June 2008.

The financial contributions increased significantly over time.

The financial support for the primary election is significantly higher.

Univariate Analysis

What is the structure of your dataset?

My dataset has 15243 observations and 18 variables. I will focus on the most relevant variables for my report which are:

  • contbr_nm: the contributor’s name
  • contbr_city: the contributor’s city
  • contb_receipt_amt: the amount
  • contbr_occupation
  • contbr_employer
  • election_tp: election type

What is/are the main feature(s) of interest in your dataset?

I would like to explore how the financial contributions vary over time. I already found that there was a huge increase from early 2007, when he announced his candidacy, until the general election. But is there a difference in contributions among counties?

What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

I will mainly use contb_receipt_dt, contb_receipt_amt.

Did you create any new variables from existing variables in the dataset?

I created new categories for county, employer and occupation.

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

I log-transformed the ‘Contribution amount’ histogram, which was right-skewed. I discovered that most of the contributions are under 500 USD, and I will look at these amounts in more detail.
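The idea behind the log transformation can be sketched as follows, in Python rather than R. This is a minimal illustration and not the code behind the plot: `log10_histogram` and its `bins_per_decade` parameter are hypothetical names, and the sketch assumes that refunds (negative amounts) and missing values are dropped, since they have no logarithm.

```python
import math
from collections import Counter


def log10_histogram(amounts, bins_per_decade=2):
    """Count positive amounts in equal-width bins on a log10 scale."""
    counts = Counter()
    for amt in amounts:
        if amt is None or amt <= 0:
            continue  # refunds and missing values cannot be log-transformed
        bin_index = math.floor(math.log10(amt) * bins_per_decade)
        counts[bin_index] += 1
    # Label each bin by its lower edge in dollars, in increasing order
    return {
        round(10 ** (i / bins_per_decade), 2): n
        for i, n in sorted(counts.items())
    }
```

On a log scale, the bins for 10 to 100 USD get as much width as the bins for 1000 to 10000 USD, which spreads out the crowded low end of a right-skewed distribution.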

Bivariate Plots Section

First I will look at a density plot of the contribution amounts.

The majority of contributions fall under 500 USD, with small bumps at 1000 USD and close to 2300 USD.

I am interested in looking at the frequency of these amounts.

Most of the donations are ‘small’ amounts of 100, 25 and 50 USD. There are few ‘large’ donations over 250 USD.
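Counting how often each amount occurs is a short tabulation. The following Python sketch shows the idea under the assumption that refunds and missing amounts are excluded; `most_common_amounts` is a hypothetical helper, not a function from the report.

```python
from collections import Counter


def most_common_amounts(amounts, top=3):
    """Return the `top` most frequent positive amounts as (amount, count) pairs."""
    counts = Counter(a for a in amounts if a is not None and a > 0)
    return counts.most_common(top)
```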

Most contributions were made in 2008, in sharp contrast to 2007. This might be explained by the fact that both the primary and the general election took place in 2008.

I now know that most donations are under 250 USD and more donations happened in 2008 than 2007, but are both years receiving ‘small’ donations?

## $`2007`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -4600.0    50.0   100.0   337.3   250.0  4600.0 
## 
## $`2008`
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -4600.00    25.00    55.71   191.60   100.00  4600.00

It is interesting to see that although 2008 received more contributions, these came from small amounts rather than big donations.

In 2007, 75% of contributions were at or under 250 USD, compared with 100 USD in 2008.
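A per-year summary like the one above can be reproduced in outline with the Python standard library. This is a sketch under stated assumptions, not the R call used for the report: `summary_by_year` is a hypothetical helper, dates are assumed to look like `16-Oct-08` as in `contb_receipt_dt`, and the `inclusive` quantile method is chosen because it interpolates in the same way as R's default quantile type.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median, quantiles


def summary_by_year(records):
    """Summarise (date_string, amount) pairs by calendar year."""
    by_year = defaultdict(list)
    for date_str, amount in records:
        if not date_str or amount is None:
            continue  # skip rows with a missing date or amount
        # %b expects English month abbreviations under the default locale
        year = datetime.strptime(date_str, "%d-%b-%y").year
        by_year[year].append(amount)

    out = {}
    for year, amts in sorted(by_year.items()):
        if len(amts) >= 2:
            q1, _, q3 = quantiles(amts, n=4, method="inclusive")
        else:
            q1 = q3 = amts[0]  # a single value has no spread
        out[year] = {
            "min": min(amts), "q1": q1, "median": median(amts),
            "mean": mean(amts), "q3": q3, "max": max(amts),
        }
    return out
```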

Now I want to see how the contributions vary over the months of both years.

Obama announced his candidacy in February 2007. In June 2008, the Democratic primary season ended with Obama victorious. It is then that a sharp increase in contributions starts, continuing to the general election in November 2008. It is interesting to see the high peak in the months before the election.
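The monthly view behind this observation amounts to grouping contributions by year and month. A minimal Python sketch, assuming `(date_string, amount)` pairs with dates formatted as in `contb_receipt_dt`; `monthly_totals` is a hypothetical name used only for illustration.

```python
from collections import defaultdict
from datetime import datetime


def monthly_totals(records):
    """Return {(year, month): (number_of_contributions, total_amount)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for date_str, amount in records:
        if not date_str or amount is None:
            continue  # skip rows with a missing date or amount
        d = datetime.strptime(date_str, "%d-%b-%y")
        key = (d.year, d.month)
        totals[key][0] += 1
        totals[key][1] += amount
    # Sort chronologically so the result can be plotted as a time series
    return {k: tuple(v) for k, v in sorted(totals.items())}
```

Adding the county to the grouping key would give one series per county for the comparison that follows.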

Did this sudden financial support start in all counties? I am curious to see how it varies by county. Are some locations more supportive? Were some places more supportive at the beginning of his campaign, or towards the end?

Most contributions come from Honolulu county, followed by Maui.

How will the data look when I compare 2007 vs 2008?

This graph is interesting because of the contrast between counties. Hawaii county was most supportive in 2007, at the beginning of the campaign, more so than in 2008.

Honolulu has been constant in being very supportive over time.

And you cannot ignore the huge peak for Maui in Aug 2008, the summer before the big date.

So I wonder if there is a relation between loyalty and location. Are there places with more loyal contributors? How much do the loyal contributors donate?

Surprisingly, the ‘Retired’ occupation is the most loyal group, with individuals who made over 20 donations.

What about employer?

‘Not employed’ and ‘self-employed’ form the most loyal groups, followed by State of Hawaii and University of Hawaii.

Are these groups also generous? Do they make big donations over 2000 USD?

I am surprised to see again that the ‘Retired’ group is not only the most loyal but also the most generous, being the largest group with donations over 2000 USD. Also generous are supporters with occupations such as ‘Attorney’, ‘Home maker’, ‘Physician’ and ‘Not-employed’.
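The two labels used here can be stated precisely: ‘loyal’ means more than 20 donations under the same name, and ‘generous’ means at least one donation over 2000 USD. The Python sketch below applies those two thresholds to `(contbr_nm, amount)` pairs. It is an illustration only: the function and constant names are hypothetical, and it assumes that refunds and missing amounts do not count as donations.

```python
from collections import Counter

LOYAL_MIN_DONATIONS = 20    # "over 20 donations" under the same name
GENEROUS_MIN_AMOUNT = 2000  # "over 2000 USD" in a single donation


def loyal_and_generous(records):
    """Return (loyal_names, generous_names) from (name, amount) pairs."""
    donation_counts = Counter()
    generous = set()
    for name, amount in records:
        if amount is None or amount <= 0:
            continue  # assumption: refunds and missing amounts are ignored
        donation_counts[name] += 1
        if amount > GENEROUS_MIN_AMOUNT:
            generous.add(name)
    loyal = {n for n, c in donation_counts.items() if c > LOYAL_MIN_DONATIONS}
    return loyal, generous
```

Joining either set back to occupation, employer or city gives the group comparisons discussed in this section.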

Where do these loyal and generous contributors live?

## $Hawaii
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       5      25      50     107     100    2300 
## 
## $Honolulu
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.19   30.00  100.00  272.30  200.00 2575.00 
## 
## $Kauai
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    25.0    50.0   123.2   100.0  2300.0 
## 
## $Maui
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.27   25.00   50.00  164.40  100.00 2300.00 
## 
## $Other
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.67   25.00   58.79  145.20  100.00 2400.00

The most loyal and generous contributors live in Honolulu city, followed by Kailua, both part of Honolulu county. Many also live in the cities of Hilo and Kamuela, which are part of Hawaii (Big Island) county.

Bivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset?

The contribution amount varies with location: Honolulu city, and Honolulu county in general, made the most contributions. Honolulu county has the highest mean. Hawaii and Honolulu counties made more contributions in 2007 than the others. Maui and Kauai had a sudden rise in contributions in the summer before the general election.

Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?

It was interesting to discover the relation between location, loyalty and generosity: Honolulu is the city and county with the most loyal contributors (over 20 donations made under the same name) and the most generous ones (with contributions over 2000 USD).

What was the strongest relationship you found?

I found a strong relation between occupation and amount, as described above.

Multivariate Plots Section

I have seen that the amount varies with location: most of the contributions came from Honolulu city, and Hawaii county was supportive in 2007 during the primary election. I have also seen a sudden increase in donations in the summer before the general election for all places, and that the retired group is the most loyal and generous.

So I wonder how much Hawaii and Honolulu county supported in the first vs second year and why the sudden support in the summer before general election? Who are these supporters?

Honolulu has been not only very supportive in both years but also more generous than the other counties. Its 3rd quartile is at least 200 USD in both years, compared with 100 USD for the other counties.

Knowing that over time small donations played an important role, who are the supporters?

I start by looking at the occupation of those who made ‘small’ donations under 250 USD.

## $`Educational services`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    25.0    50.0   107.7   100.0  2300.0 
## 
## $`Health care and social assistance`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    50.0   100.0   237.2   250.0  2300.0 
## 
## $`Leisure and hospitality`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    25.0    25.0    50.0   228.4   250.0  2250.0 
## 
## $Manufacturing
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   25.00   25.00   50.00   40.56   50.00   65.00 
## 
## $Media
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.00   87.66  100.00  254.40  200.00 2300.00 
## 
## $Not_employed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    25.0    60.0   226.4   100.0  2300.0 
## 
## $Other
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.32   25.00   60.00  204.50  114.60 2400.00 
## 
## $`Profesional and business services`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       5      50     100     422     300    2300 
## 
## $Retired
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.19   25.00  100.00  195.30  100.00 2575.00 
## 
## $Self_employed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    25.0   100.0   420.7   250.0  2300.0 
## 
## $`State and local government`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    25.0    25.0    50.0   160.7   250.0   500.0

In the first graph, what stands out as above average is the ‘State and local government’ occupation for Hawaii, a county that was supportive at the beginning and during 2007, while for Honolulu it is the ‘Leisure and hospitality’ occupation.

Looking at the mean and median, while the median is between 50 and 100 for all occupation groups, the ‘Professional and business services’ and ‘Self-employed’ have the highest mean.
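A mean well above the median indicates that a few large donations pull the average up. The Python sketch below makes that gap explicit using the (median, mean) pairs copied from the summary output above. The `REPORTED` table, the `skew_ratio` helper and the mean-to-median ratio itself are illustrative choices and not part of the original analysis.

```python
# (median, mean) pairs copied from the per-occupation summaries above
REPORTED = {
    "Educational services": (50.0, 107.7),
    "Health care and social assistance": (100.0, 237.2),
    "Leisure and hospitality": (50.0, 228.4),
    "Manufacturing": (50.0, 40.56),
    "Media": (100.0, 254.4),
    "Not_employed": (60.0, 226.4),
    "Other": (60.0, 204.5),
    "Professional and business services": (100.0, 422.0),
    "Retired": (100.0, 195.3),
    "Self_employed": (100.0, 420.7),
    "State and local government": (50.0, 160.7),
}


def skew_ratio(median, mean):
    """Mean divided by median; values well above 1 suggest a long right tail."""
    return mean / median


# Group with the highest mean, and groups ranked by how far the mean sits above the median
highest_mean = max(REPORTED, key=lambda k: REPORTED[k][1])
ranked_by_ratio = sorted(REPORTED, key=lambda k: skew_ratio(*REPORTED[k]), reverse=True)
```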

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

In my report I am exploring how the financial contributions vary over time. I saw how they vary with location. Now, looking at occupation, I see that for most groups there was an increase in small donations towards the end of the campaign.

Were there any interesting or surprising interactions between features?

It is interesting to see the relation between amount and occupation and how this relation varied over location and time. I can see that the ‘State and local government’ group is found only in Hawaii county, with the highest median but a maximum of 500 USD.

Final Plots and Summary

Plot One

Description One

Although 2007 is the year with the higher mean and median, most contributions were made in 2008.

More and higher donations came from Hawaii county in 2007. For example, it is the county where supporters with the occupation ‘State and local government’ live, and they donated more than the average.

Plot Two

Description Two

Most of the 2008 donations happened in the summer before the general election. The graph shows a high peak for all counties.

Plot Three

Description Three

Small donations from individuals with different occupations across all locations played an important role in this campaign. The ‘Retired’ group is the largest and most loyal group, and can be found in all counties.

Reflection

I started the analysis curious to explore the variation in financial contributions over time and across locations. Since Hawaii is the native state of President Obama, I expected to find a very supportive state from early February 2007, when he announced his candidacy for the Democratic nomination.

I succeeded in getting an understanding of the variation over location and time: Honolulu and Hawaii (Big Island) counties were supportive from the beginning of 2007, but not all counties were. However, larger donations were made in the first year. And Honolulu, the President’s place of origin, has proved to be the most supportive county.

I found that in the following year in 2008, the financial support increased significantly thanks to small donations, especially in the summer before the general election when all counties showed support. The supporters have various occupations like ‘Attorney’, ‘Home-maker’, ‘Physician’, ‘Educational services’ and ‘Not-employed’. But the ‘Retired’ group appears to have large donations over 2000 USD and more donations under the same name.

There are some limitations to my analysis due to the lack of pre-existing categories for occupation, employer and city. Without these, even after creating new functions and subsetting the data, I still have a large ‘Other’ category and my analysis is incomplete.

For further analysis, one idea would be to define a clearer profile of the contributors. For this to be possible, the contributors would need to provide more information about themselves, such as age, gender and nationality.